Glottal Source Estimation and Automatic Detection of Dysphonic Speakers
Abstract
Among biomedical signals, speech is one of the most complex, since it is both produced and received by humans. Extracting and analyzing the information conveyed by this signal underlies many applications, including the two topics of this thesis: the estimation of the glottal source and the automatic detection of voice pathologies. The first part of the thesis reviews existing methods for glottal source estimation, then focuses on the occurrence of irregular glottal source estimates when the representation based on the Zeros of the Z-Transform (ZZT) is used. Because this method is sensitive to the location of the analysis window, it is proposed to regularize the estimation by shifting the analysis window around its initial location. The best shift is found with a dynamic programming algorithm that includes constraints on the glottal source and the vocal tract response, both of which are estimated by the ZZT-based method for each shift. From the regularized glottal source, characteristic parameters are estimated by finding the best-fitting glottal source model. The application of this method to real speech is presented.
The second part of the thesis is devoted to the development of automatic methods for the detection of voice pathologies. These pathologies are usually assessed in clinics by means of perceptual and objective analyses. To support this assessment, new objective methods are needed to detect a pathology or to evaluate voice quality before and after surgery. After a broad overview of existing methods in terms of features and classification approaches, and a comparison of different feature selection methodologies, the thesis investigates to what extent a limited number of features can be combined in a simple classification approach to detect the presence of a pathology.
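The window-shift regularization described for the first part can be illustrated with a minimal dynamic programming sketch. The abstract does not specify the cost functions, so everything here is an assumption: `local_cost` stands in for a per-frame penalty on each candidate shift, `transition_cost` stands in for the constraints linking the glottal source and vocal tract estimates of consecutive frames, and the toy numbers are made up. It shows only the generic path search, not the thesis's actual algorithm.

```python
import numpy as np


def best_shift_path(local_cost, transition_cost):
    """Pick one window shift per frame by minimizing a total path cost.

    local_cost[t, s]         : hypothetical cost of shift s at frame t
    transition_cost[t, i, j] : hypothetical cost of moving from shift i at
                               frame t to shift j at frame t + 1
    """
    n_frames, n_shifts = local_cost.shape
    total = np.empty((n_frames, n_shifts))
    back = np.zeros((n_frames, n_shifts), dtype=int)
    total[0] = local_cost[0]

    for t in range(1, n_frames):
        # candidates[i, j]: best cost ending in shift i at t - 1, then moving to j
        candidates = total[t - 1][:, None] + transition_cost[t - 1]
        back[t] = np.argmin(candidates, axis=0)
        total[t] = candidates[back[t], np.arange(n_shifts)] + local_cost[t]

    # Trace the cheapest path backwards from the last frame.
    path = np.empty(n_frames, dtype=int)
    path[-1] = int(np.argmin(total[-1]))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path, float(total[-1, path[-1]])


# Toy data (made up): 3 frames, 3 candidate shifts.
demo_local = np.array([[1.0, 0.0, 2.0],
                       [0.0, 3.0, 1.0],
                       [2.0, 1.0, 0.0]])
shift_ids = np.arange(3)
# Assumed smoothness penalty: absolute difference between consecutive shifts.
step_penalty = np.abs(shift_ids[:, None] - shift_ids[None, :]).astype(float)
demo_transition = np.stack([step_penalty, step_penalty])

demo_path, demo_cost = best_shift_path(demo_local, demo_transition)
print(demo_path, demo_cost)
```

In a real setting the two cost arrays would be computed from the ZZT-based estimates obtained at each shift. The search itself is unchanged whatever costs are plugged in.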
A first application shows that the correlation between acoustic descriptors, which do not require the estimation of the fundamental period, discriminates well between normal and pathological sustained vowels. A second application shows the benefit of combining information extracted from the speech signal with the estimation of the glottal source for the detection of voice pathologies. In this application, two features (one computed on the speech signal and the other on the glottal contribution) are selected by means of a mutual information-based measure, and their distributions for normal and pathological voices are estimated to derive a simple classifier based on Gaussian Mixture Models. The ability of this classification approach to discriminate between normal and pathological sustained vowels is demonstrated, and it is proposed to qualify the decision provided by the classifier by including indetermination zones in the normal/pathological decision. These precautions increase the reliability of the decision provided to the clinician.
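The two-feature classifier with indetermination zones can be sketched as follows. This is only an illustration under stated assumptions: the mixture parameters are made up rather than fitted to data, the zone is defined here as a band of width `margin` around a zero log-likelihood ratio, and the abstract does not say how the thesis actually defines its zones.

```python
import numpy as np


def gmm_logpdf(x, weights, means, covs):
    """Log-density of a Gaussian mixture at points x (n_points x n_dims)."""
    x = np.atleast_2d(x)
    n_dims = x.shape[1]
    log_terms = []
    for w, mu, cov in zip(weights, means, covs):
        diff = x - mu
        maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        log_terms.append(
            np.log(w) - 0.5 * (maha + logdet + n_dims * np.log(2 * np.pi))
        )
    # Sum the component densities in the log domain for numerical stability.
    return np.logaddexp.reduce(np.stack(log_terms), axis=0)


def classify(x, normal_gmm, patho_gmm, margin=1.0):
    """Label each point; withhold a decision when the evidence is weak."""
    llr = gmm_logpdf(x, *patho_gmm) - gmm_logpdf(x, *normal_gmm)
    labels = np.where(
        llr > margin, "pathological",
        np.where(llr < -margin, "normal", "undetermined"),
    )
    return labels, llr


# Hypothetical class models over two features (weights, means, covariances).
normal_gmm = ([1.0], [np.array([0.0, 0.0])], [np.eye(2)])
patho_gmm = (
    [0.5, 0.5],
    [np.array([3.0, 3.0]), np.array([4.0, 1.0])],
    [np.eye(2), np.eye(2)],
)

points = np.array([[0.0, 0.0], [1.5, 1.5], [3.0, 3.0]])
labels, llr = classify(points, normal_gmm, patho_gmm)
print(list(zip(labels.tolist(), np.round(llr, 2).tolist())))
```

A wider `margin` makes the classifier abstain more often. That trades coverage for reliability, which is the stated purpose of the indetermination zones.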
Related resources
Salience analysis for glottal cycle detection in disordered speech
The presentation concerns the evaluation of a temporal method for tracking cycle lengths in voiced speech. The speech cycles are detected via the saliences of the speech signal samples. The method does not require that the signal be locally periodic or that the average period length be known a priori. The cycle length extraction is applied to the analysis of dysphonic speakers affected by amyotrophic ...
Synthesis of breathy and rough voices with a view to validating perceptual and automatic glottal cycle pattern recognition
The framework of the presentation is the assessment of the ability of human raters or speech-processing software to detect glottal cycles in speech sounds and measure their lengths in synthetic breathy and rough voices. The synthesis of hoarse voices designates the generation of speech sounds whose timbre simulates the voice quality of dysphonic speakers. The added value of synthetically ...
Glottal Waveforms for Speaker Inference & A Regression Score Post-Processing Method Applicable to General Classification Problems
Contributions are made along two main lines. Firstly a method is proposed for using a regression model to learn relationships within the scores of a machine learning classifier, which can then be applied to future classifier output for the purpose of improving recognition accuracy. The method is termed r-norm and strong empirical results are obtained from its application to several text-indepen...
On the mutual information of glottal source estimation techniques for the automatic detection of speech pathologies
detection of speech pathologies by exploiting the estimation of the glottal source. Three methods of estimation are compared and time and spectral features are extracted. The relevancy of these features is assessed by means of information theory-based measures. This allows an intuitive interpretation in terms of discrimination power and redundancy between the features. It is discussed which fea...
Assessment of disordered voices based on an optimized glottal source model
In this paper, a method for the assessment of disordered voices is proposed. A feature named mean opening quotient (MOQ) obtained from the glottal source estimation is used as an acoustic cue to summarize the degree of severity of the voice disorder. The analysis method uses the empirical mode decomposition algorithm to estimate the glottal source excitation signal from the speech signal. The l...
Publication date: 2011